Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics
Dozens of new models on fixation prediction are published every year and
compared on open benchmarks such as MIT300 and LSUN. However, progress in the
field can be difficult to judge because models are compared using a variety of
inconsistent metrics. Here we show that no single saliency map can perform well
under all metrics. Instead, we propose a principled approach to solve the
benchmarking problem by separating the notions of saliency models, maps and
metrics. Inspired by Bayesian decision theory, we define a saliency model to be
a probabilistic model of fixation density prediction and a saliency map to be a
metric-specific prediction derived from the model density which maximizes the
expected performance on that metric given the model density. We derive these
optimal saliency maps for the most commonly used saliency metrics (AUC, sAUC,
NSS, CC, SIM, KL-Div) and show that they can be computed analytically or
approximated with high precision. We show that this leads to consistent
rankings in all metrics and avoids the penalties of using one saliency map for
all metrics. Our method allows researchers to have their model compete against the state of the art on many different metrics: "good" models will perform well in all metrics.
Comment: published at ECCV 2018.
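To make the separation concrete, here is a minimal NumPy sketch of how one predicted fixation density could yield different metric-specific maps. The `density` array is a random placeholder, and the two map choices are simplifications motivated by the abstract (NSS z-scores the map and is invariant to affine rescaling; AUC depends only on the ranking of map values), not the paper's full derivations.

```python
import numpy as np

def nss(saliency_map, fixations):
    """Normalized Scanpath Saliency: mean of the z-scored map at fixation points."""
    s = (saliency_map - saliency_map.mean()) / saliency_map.std()
    return s[fixations[:, 0], fixations[:, 1]].mean()

density = np.random.rand(48, 64)    # placeholder for a model's predicted fixation density
density /= density.sum()

nss_map = density                   # NSS z-scores the map, so the density itself serves
auc_map = np.log(density + 1e-12)   # AUC is rank-based, so any monotone transform works

fixations = np.array([[10, 20], [30, 40]])  # hypothetical ground-truth fixations (row, col)
print(nss(nss_map, fixations), nss(auc_map, fixations))  # same model, metric-specific maps
```

The two maps encode the same model density but score differently under the same metric, which is the point: the map, not the model, is metric-specific.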
Unified Image and Video Saliency Modeling
Visual saliency modeling for images and videos is treated as two independent
tasks in recent computer vision literature. While image saliency modeling is a
well-studied problem and progress on benchmarks like SALICON and MIT300 is
slowing, video saliency models have shown rapid gains on the recent DHF1K
benchmark. Here, we take a step back and ask: Can image and video saliency
modeling be approached via a unified model, with mutual benefit? We identify
different sources of domain shift between image and video saliency data and
between different video saliency datasets as a key challenge for effective
joint modeling. To address this, we propose four novel domain adaptation
techniques - Domain-Adaptive Priors, Domain-Adaptive Fusion, Domain-Adaptive
Smoothing and Bypass-RNN - in addition to an improved formulation of learned
Gaussian priors. We integrate these techniques into a simple and lightweight
encoder-RNN-decoder-style network, UNISAL, and train it jointly with image and
video saliency data. We evaluate our method on the video saliency datasets
DHF1K, Hollywood-2 and UCF-Sports, and the image saliency datasets SALICON and
MIT300. With one set of parameters, UNISAL achieves state-of-the-art
performance on all video saliency datasets and is on par with the
state-of-the-art for image saliency datasets, despite faster runtime and a 5 to
20-fold smaller model size compared to all competing deep methods. We provide
retrospective analyses and ablation studies which confirm the importance of the
domain shift modeling. The code is available at
https://github.com/rdroste/unisal
Comment: Presented at the European Conference on Computer Vision (ECCV) 2020. R. Droste and J. Jiao contributed equally to this work. v3: Updated Fig. 5a) and added new MIT300 benchmark results to supp. material.
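As an illustration of what a per-domain module along these lines might look like, here is a hedged PyTorch sketch of a "Domain-Adaptive Smoothing" layer that learns one Gaussian blur width per dataset. The parameterization (a learnable log-sigma per domain and a fixed separable kernel size) is an assumption for illustration, not UNISAL's actual implementation; see the linked repository for that.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DomainAdaptiveSmoothing(nn.Module):
    """Illustrative sketch: one learnable Gaussian blur width per dataset
    ("domain"); a domain index selects which width is applied at runtime."""
    def __init__(self, num_domains, kernel_size=15):
        super().__init__()
        self.log_sigma = nn.Parameter(torch.zeros(num_domains))  # per-domain blur width
        self.kernel_size = kernel_size

    def forward(self, x, domain):
        k = self.kernel_size
        sigma = self.log_sigma[domain].exp()
        coords = torch.arange(k, dtype=x.dtype, device=x.device) - (k - 1) / 2
        g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
        g = (g / g.sum()).view(1, 1, 1, k)
        c = x.shape[1]
        # separable Gaussian blur, applied channel-wise via grouped convolutions
        x = F.conv2d(x, g.expand(c, 1, 1, k), padding=(0, k // 2), groups=c)
        x = F.conv2d(x, g.view(1, 1, k, 1).expand(c, 1, k, 1), padding=(k // 2, 0), groups=c)
        return x

# e.g. smoother = DomainAdaptiveSmoothing(num_domains=4)
# out = smoother(torch.randn(2, 1, 48, 64), domain=0)  # domain 0 = hypothetical dataset slot
```

Presumably the other listed modules are switched by the same domain index while the encoder-RNN-decoder weights stay shared, which is where the cross-domain benefit would come from.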
MASON: A Model AgnoStic ObjectNess Framework
This paper proposes a simple yet very effective method to localize dominant foreground objects in an image to pixel-level precision. The proposed method, ‘MASON’ (Model-AgnoStic ObjectNess), uses a deep convolutional network to generate category-independent and model-agnostic heat maps for any image. The network is not explicitly trained for the task and can therefore be used off-the-shelf in tandem with any other network or task. We show that this framework scales to a wide variety of images, and we illustrate the effectiveness of MASON in three varied application contexts.
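In the spirit of "off-the-shelf, not explicitly trained," here is a hedged PyTorch sketch of an objectness heat map built from a pretrained backbone. The choice of VGG-16 and of summing channel activations is an assumption for illustration, not MASON's published recipe.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Pretrained backbone used off-the-shelf; it is not trained for objectness here.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()

def objectness_heatmap(image):
    """image: (1, 3, H, W), ImageNet-normalized. Returns a (1, 1, H, W) map in [0, 1]."""
    with torch.no_grad():
        feats = vgg(image)                        # (1, 512, H/32, W/32)
    heat = feats.sum(dim=1, keepdim=True)         # aggregate channel activations
    heat = F.interpolate(heat, size=image.shape[-2:],
                         mode="bilinear", align_corners=False)  # back to pixel resolution
    return (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
```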